
[BYOC] Handle constants in IRModule-at-a-time external codegen #11770

Merged: 1 commit merged into apache:main on Jun 30, 2022

Conversation

@mbs-octoml (Contributor) commented on Jun 17, 2022

I tried to do to the TensorRT integration what #11631 did to the CUTLASS integration, viz:

  • Make sure all compilation options are passed in Target instances. This helps Collage.
  • Use a custom pass invoked via RelayToTIRTargetHooks instead of the relay.ext.$toolchain mechanism.
    This helps us decouple external codegen from lowering.

This PR collects the prep for that change:

  • TensorRT uses the JSONSerializer visitor to encode each partition function. Previously, when the
    visitor encountered a Constant it simply generated and recorded a name for the constant. Then,
    completely separately, and via a callback in TECompiler, the function is visited again in the
    same order and with the same name generation convention by a ConstantUpdater to actually collect the
    bindings, which are then encoded into a ConstLoaderModule to be made available at runtime.

    However, if all TensorRT compilation is to be done by a stand-alone pass, there's no TECompiler callback
    hackery available. So I've added a "const_name_to_ndarray" attribute to the IRModule, of type
    Map<String, runtime::NDArray>, so that named constants can be accumulated throughout compilation by
    any pass which needs to do so. Then the Graph, AOT and VM executors are all updated to merge those
    constants into the final runtime artifact. (A minimal sketch of how a pass might populate this
    attribute follows this list.)

    (Compare with "Constants", the equivalent attribute for extracting TIR AllocateConsts.)

  • The TensorRT tests use the create_executor interface, but it wasn't quite ready for the
    new, more general form of passing a list of targets.

  • I want TensorRT compilation to work out of the box, with no special target needed when all the
    default options apply. I've gone back and made my earlier CUTLASS integration follow the same
    convention.

  • To test this I also switched the 'demo' "ccompiler" external codegen target to IRModule-at-a-time
    style. This means we can test most of the external codegen machinery in one place without depending
    on any target which may not be enabled in CI (e.g. TensorRT):

    • Target instances are plumbed correctly so compile-time options are available.
    • External modules are conveyed to the final export library.
    • Constant bindings are conveyed to the metadata module.
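
For illustration, here is a minimal sketch of how a stand-alone codegen pass could accumulate a named constant into the new attribute. This is not the PR's actual code: the helper name RecordExternConstant is made up, but GetAttr, WithAttr and Map<String, runtime::NDArray> are the real TVM building blocks involved.

```cpp
// Sketch only: how an IRModule-at-a-time external codegen pass might record a
// constant binding in the module-level "const_name_to_ndarray" attribute
// (tvm::attr::kConstNameToNDArray in this PR) so the Graph/AOT/VM executors can
// later merge it into the runtime artifact. RecordExternConstant is a
// hypothetical helper, not an API added by this PR.
#include <tvm/ir/module.h>
#include <tvm/runtime/container/map.h>
#include <tvm/runtime/ndarray.h>

namespace {

using tvm::IRModule;
using tvm::Map;
using tvm::String;
using tvm::WithAttr;
using tvm::runtime::NDArray;

IRModule RecordExternConstant(IRModule mod, const String& name, const NDArray& data) {
  // Start from whatever earlier passes already accumulated (empty map if none).
  Map<String, NDArray> const_name_to_ndarray =
      mod->GetAttr<Map<String, NDArray>>("const_name_to_ndarray")
          .value_or(Map<String, NDArray>());
  const_name_to_ndarray.Set(name, data);
  // Module attributes are functional, so attach the updated map to a copy of the module.
  return WithAttr(std::move(mod), "const_name_to_ndarray", const_name_to_ndarray);
}

}  // namespace
```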

@mbs-octoml (Contributor, Author)

tests/python/frontend/pytorch/test_forward.py::test_argsort appears to be a flake.

@masahi (Member) left a comment

cc @Mousius @manupa-arm

  • A minor suggestion: use value_or rather than supplying a default value and having to do .value(), i.e. lowered_mod->GetAttr<Map<String, runtime::NDArray>>(tvm::attr::kConstNameToNDArray)->value_or({}) (not sure if the {} thing works). See the before/after sketch right after this list.
  • Split the inline stuff into another PR.
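
For reference, a rough before/after sketch of that suggestion (not the exact PR diff; it assumes lowered_mod is the lowered IRModule, the usual TVM headers, and spells the fallback as an explicit empty Map rather than {}):

```cpp
#include <tvm/ir/module.h>

using tvm::IRModule;
using tvm::Map;
using tvm::String;
using tvm::runtime::NDArray;

Map<String, NDArray> GetConstBindings(const IRModule& lowered_mod) {
  // Before: pass a default into GetAttr, then unwrap the Optional with .value().
  // return lowered_mod
  //     ->GetAttr<Map<String, NDArray>>(tvm::attr::kConstNameToNDArray, Map<String, NDArray>())
  //     .value();

  // After (the suggestion): read the Optional and let value_or supply the empty-map fallback.
  return lowered_mod->GetAttr<Map<String, NDArray>>(tvm::attr::kConstNameToNDArray)
      .value_or(Map<String, NDArray>());
}
```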

@areusch (Contributor) left a comment

Just did a quick review before I ran out the door. Main question here: is there a test that actually populates the kConstNameToNDArray attr and watches it come out the end of the compiler?

Review threads were opened on:
  • include/tvm/ir/module.h
  • python/tvm/relay/build_module.py
  • src/relay/backend/aot_executor_codegen.cc
  • src/relay/backend/contrib/codegen_json/codegen_json.h
  • src/relay/backend/contrib/cutlass/codegen.cc
@mbs-octoml (Contributor, Author) commented on Jun 27, 2022

Thanks @masahi & @areusch, PTAL.

Re value_or: done, thanks, that's much nicer.
Re split PRs, done, see #11923. This was already a split of a split so sorry for the grab bag.
Re the end-to-end test: I think this should be done in test_target_hooks.py, but really exercising everything requires a new fake JSON-based runtime module, a fake custom codegen pass and a new fake external codegen target. I started down that path but it will be quite a lift. Unfortunately we can't rely on the TensorRT unit test for this (the first user of the new const_name_to_constants attribute), since the functionality isn't testable without actually running TensorRT, which is not enabled in CI. WDYT?

@areusch (Contributor) commented on Jun 28, 2022

I hate to be a stickler, but I feel like something should test this in CI. Would such a fake module become useful in other compiler testing?

@mbs-octoml (Contributor, Author) commented on Jun 29, 2022

It turns out switching the 'ccompiler' demo external codegen to use target hooks is enough to exercise all of this, albeit the PR has grown quite a bit. Let's see what CI makes of it, since I've probably broken something else.
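
For anyone following along, a rough sketch of what "IRModule-at-a-time via target hooks" means here: a target kind registers a module-level pass under its RelayToTIR attribute, and RelayToTIRTargetHooks runs that pass during compilation instead of the relay.ext.$toolchain callback. This is not the PR's actual ccompiler code; the pass name and body below are placeholders, while TVM_REGISTER_TARGET_KIND, CreateModulePass and the "RelayToTIR" attribute are the real mechanisms.

```cpp
// Sketch: registering an IRModule-at-a-time external codegen via a target hook.
// The pass body is a placeholder; a real implementation would generate C source
// for every "ccompiler"-tagged function, attach the resulting runtime::Module,
// and record constants in "const_name_to_ndarray".
#include <tvm/ir/module.h>
#include <tvm/ir/transform.h>
#include <tvm/target/target_kind.h>

namespace {

using tvm::IRModule;
using tvm::transform::CreateModulePass;
using tvm::transform::Pass;
using tvm::transform::PassContext;

Pass CCompilerRelayToTIR() {
  auto pass_func = [](IRModule mod, PassContext ctx) -> IRModule {
    // Placeholder: rewrite the partitioned "ccompiler" functions here and stash
    // the generated runtime::Module and constant bindings as module attributes.
    return mod;
  };
  return CreateModulePass(pass_func, /*opt_level=*/0, "relay.ext.ccompiler.sketch", {});
}

// Hanging the pass off the target kind is what lets RelayToTIRTargetHooks find it.
TVM_REGISTER_TARGET_KIND("ccompiler", kDLCPU)
    .set_attr<Pass>("RelayToTIR", CCompilerRelayToTIR());

}  // namespace
```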

@mbs-octoml (Contributor, Author)

PTAL @areusch

@areusch areusch merged commit 985680e into apache:main Jun 30, 2022
@mbs-octoml mbs-octoml deleted the mbs-prep-for-trt branch June 30, 2022 19:22
blackkker pushed a commit to blackkker/tvm that referenced this pull request Jul 7, 2022
masahi pushed a commit to masahi/tvm that referenced this pull request Jul 15, 2022
mikeseven pushed a commit to mikeseven/tvm that referenced this pull request Sep 27, 2023